Suppression distance computation for hierarchical clusterings
Authors
Abstract
We discuss the computation of a distance between two hierarchical clusterings of the same set. The distance is defined as the minimum number of elements that must be removed so that the two remaining clusterings are equal. Computing this distance has been studied extensively for partitions. We prove that it can be solved in polynomial time for hierarchies, because the problem gives rise to a class of perfect graphs. We also propose an algorithm based on recursively computing maximum assignments.
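The definition above can be illustrated with a small brute-force sketch. This is not the polynomial-time algorithm from the paper; it simply tries every subset of elements, so it is only usable on tiny inputs. It assumes that a hierarchy is given as a set of clusters (each a frozenset of elements) and that restricting a hierarchy to the kept elements means intersecting each cluster with them and discarding empty results. The names `hierarchy`, `restrict`, and `suppression_distance` are hypothetical helpers chosen for this illustration.

```python
from itertools import combinations


def hierarchy(*clusters):
    """Build a hierarchy as a set of frozensets (assumed representation)."""
    return {frozenset(c) for c in clusters}


def restrict(h, kept):
    """Intersect every cluster with `kept`; drop clusters that become empty."""
    return {c & kept for c in h if c & kept}


def suppression_distance(h1, h2, elements):
    """Smallest number of removed elements that makes both restrictions equal.

    Exhaustive search over subsets: exponential time, for illustration only.
    """
    elements = sorted(set(elements))
    n = len(elements)
    for removed in range(n + 1):
        for kept in combinations(elements, n - removed):
            kept = frozenset(kept)
            if restrict(h1, kept) == restrict(h2, kept):
                return removed
    return n  # not reached: removing every element always gives equality


# Two hierarchies on {a, b, c, d} that differ in one three-element cluster.
h1 = hierarchy("a", "b", "c", "d", "ab", "abc", "abcd")
h2 = hierarchy("a", "b", "c", "d", "ab", "abd", "abcd")

print(suppression_distance(h1, h2, "abcd"))  # prints 1
```

Removing the single element d (or c) leaves the same clusters in both hierarchies, so the distance in this example is 1.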
Related articles
MultiDendrograms: Variable-Group Agglomerative Hierarchical Clusterings
MultiDendrograms is a Java-written application that computes agglomerative hierarchical clusterings of data. Starting from a distances (or weights) matrix, MultiDendrograms is able to calculate its dendrograms using the most common agglomerative hierarchical clustering methods. The application implements a variable-group algorithm that solves the non-uniqueness problem found in the standard pai...
Measuring the Quality of Approximated Clusterings
Clustering has become an increasingly important task in modern application domains. In many areas, e.g. when clustering complex objects, in distributed clustering, or when clustering mobile objects, due to technical, security, or efficiency reasons it is not possible to compute an “optimal” clustering. Recently a lot of research has been done on efficiently computing approximated clusterings. H...
Optimization and Simplification of Hierarchical Clusterings
Clustering is often used to discover structure in data. Clustering systems differ in the objective function used to evaluate clustering quality and the control strategy used to search the space of clusterings. In general, a search strategy cannot both (1) consistently construct clusterings of high quality and (2) be computationally inexpensive. However, we can partition the search so that a sys...
Which, When, and How: Hierarchical Clustering with Human-Machine Cooperation
Human–Machine Cooperations (HMCs) can balance the advantages and disadvantages of human computation (accurate but costly) and machine computation (cheap but inaccurate). This paper studies HMCs in agglomerative hierarchical clusterings, where the machine can ask the human some questions. The human will return the answers to the machine, and the machine will use these answers to correct errors i...
A General Paradigm for Fast, Adaptive Clustering of Biological Sequences
There are numerous methods that compute clusterings of biological sequences based on pairwise distances. This necessitates the computation of O(n^2) sequence comparisons. Users usually want to apply the most sensitive distance measure which normally is the most expensive in terms of runtime. This poses a problem if the number of sequences is large or the computation of the measure is slow. In thi...
Journal:
- Inf. Process. Lett.
Volume 115, Issue
Pages -
Publication date: 2015